Differentiable Meta-logical Programming
Deep learning uses an increasing amount of computation and data to solve very
specific problems. By stark contrast, human minds solve a wide range of
problems using a fixed amount of computation and limited experience. One
ability that seems crucial to this kind of general intelligence is
meta-reasoning, i.e., our ability to reason about reasoning. To make deep
learning do more from less, we propose the differentiable logical
meta-interpreter (DLMI). The key idea is to realize a meta-interpreter using
differentiable forward-chaining reasoning in first-order logic. This directly
allows DLMI to reason and even learn about its own operations. This is
different from performing object-level deep reasoning and learning, which
refers in some way to entities external to the system. In contrast, DLMI is
able to reflect or introspect, i.e., to shift from meta-reasoning to
object-level reasoning and vice versa. Among many other experimental
evaluations, we illustrate this behavior using the novel task of "repairing
Kandinsky patterns," i.e., how to edit the objects in an image so that it
agrees with a given logical concept.
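The differentiable forward-chaining idea behind DLMI can be sketched in a few lines — a minimal, hypothetical illustration, not the paper's implementation: soft truth values in [0, 1] over ground atoms, product as a soft conjunction over a clause body, and max as a soft disjunction when merging new evidence. The atoms and the toy rule are assumptions chosen for clarity.

```python
import numpy as np

# Minimal sketch of one differentiable forward-chaining step.
# Toy rule (an assumption, not from the paper):
#   grandparent(X,Z) <- parent(X,Y), parent(Y,Z)

atoms = ["parent(a,b)", "parent(b,c)", "grandparent(a,c)"]
v = np.array([0.9, 0.8, 0.0])  # initial soft valuation over the ground atoms

def forward_step(v):
    """Apply one soft forward-chaining step for the single ground rule."""
    new_v = v.copy()
    body = v[0] * v[1]                     # soft conjunction of the rule body
    new_v[2] = np.maximum(new_v[2], body)  # soft disjunction with prior value
    return new_v

v1 = forward_step(v)  # grandparent(a,c) now holds with soft truth 0.9 * 0.8
```

Because every operation is differentiable (up to the max), gradients can flow from a loss on the derived atoms back to the facts and rule weights, which is what lets such a system learn about its own inference.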
Rebalanced Zero-shot Learning
Zero-shot learning (ZSL) aims to identify unseen classes with zero samples
during training. Broadly speaking, present ZSL methods usually adopt
class-level semantic labels and compare them with instance-level semantic
predictions to infer unseen classes. However, we find that existing models
mostly produce imbalanced semantic predictions, i.e., these models may perform
precisely for some semantics but not for others. To address this drawback,
we aim to introduce an imbalanced learning framework into ZSL. However, we find
that imbalanced ZSL has two unique challenges: (1) Its imbalanced predictions
are highly correlated with the value of semantic labels rather than the number
of samples as typically considered in the traditional imbalanced learning; (2)
Different semantics follow quite different error distributions between classes.
To mitigate these issues, we first formalize ZSL as an imbalanced regression
problem which offers empirical evidences to interpret how semantic labels lead
to imbalanced semantic predictions. We then propose a re-weighted loss termed
Re-balanced Mean-Squared Error (ReMSE), which tracks the mean and variance of
error distributions, thus ensuring rebalanced learning across classes. As a
major contribution, we conduct a series of analyses showing that ReMSE is
theoretically well established. Extensive experiments demonstrate that the
proposed method effectively alleviates the imbalance in semantic prediction and
outperforms many state-of-the-art ZSL methods. Our code is available at
https://github.com/FouriYe/ReZSL-TIP23.
Comment: Accepted to IEEE Transactions on Image Processing (TIP) 2023
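A re-weighted MSE in the spirit of ReMSE can be sketched as follows. This is a hedged illustration only: it tracks per-semantic mean error across the batch and up-weights poorly predicted semantic dimensions; the paper's actual formulation also tracks error variance and may weight differently.

```python
import numpy as np

# Sketch of a rebalanced MSE: semantics with larger mean error get larger
# weights, so no semantic dimension dominates or is neglected during training.

def rebalanced_mse(pred, target, eps=1e-8):
    err = (pred - target) ** 2              # (batch, n_semantics) squared errors
    mean_err = err.mean(axis=0)             # per-semantic mean error
    w = mean_err / (mean_err.mean() + eps)  # weights normalized to mean 1
    return (w * err).mean()

pred = np.array([[0.9, 0.1], [0.8, 0.3]])
target = np.array([[1.0, 0.0], [1.0, 0.0]])
loss = rebalanced_mse(pred, target)  # up-weights the worse second semantic
```

Compared with a plain MSE, the re-weighted loss is larger whenever errors are unevenly distributed across semantics, which is exactly the signal that drives rebalancing.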
Turn-Level Active Learning for Dialogue State Tracking
Dialogue state tracking (DST) plays an important role in task-oriented
dialogue systems. However, collecting a large amount of turn-by-turn annotated
dialogue data is costly and inefficient. In this paper, we propose a novel
turn-level active learning framework for DST to actively select turns in
dialogues to annotate. Experimental results demonstrate that, given a limited
labelling budget, selective annotation of dialogue turns is effective.
Additionally, our approach can effectively achieve comparable DST performance
to traditional training approaches with significantly less annotated data,
which provides a more efficient way to annotate new dialogue data.
Comment: EMNLP 2023 Main Conference
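One common way to select turns under a labelling budget is uncertainty sampling, sketched below. This is a generic illustration, not necessarily the paper's criterion: each turn is scored by the predictive entropy of the model's slot-value distribution, and the most uncertain turns are sent for annotation. The turn ids and distributions are made up.

```python
import math

# Turn-level uncertainty sampling sketch for DST annotation.

def entropy(probs):
    """Shannon entropy of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_turns(turn_probs, budget):
    """Pick the `budget` most uncertain turns for annotation."""
    ranked = sorted(turn_probs, key=lambda t: entropy(turn_probs[t]), reverse=True)
    return ranked[:budget]

turn_probs = {
    "t1": [0.98, 0.01, 0.01],  # model confident: low annotation value
    "t2": [0.40, 0.35, 0.25],  # model uncertain: high annotation value
    "t3": [0.60, 0.30, 0.10],
}
picked = select_turns(turn_probs, budget=2)  # t2 and t3 are selected
```

Spending the budget where the model is least certain is what lets turn-level selection match full annotation with far fewer labels.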
Post-marketing safety surveillance of sacituzumab govitecan: an observational, pharmacovigilance study leveraging FAERS database
Background and objective: Sacituzumab govitecan (SG), the first antibody-drug conjugate targeting human trophoblast cell-surface antigen 2 (Trop-2), has been approved by the Food and Drug Administration (FDA) for the treatment of advanced or metastatic breast cancer and urothelial cancer. However, there is currently a dearth of information regarding the safety profile of SG in a large-sample cohort. The objective of the present study is to investigate SG-related adverse events (AEs) in real-world settings, leveraging the FDA Adverse Event Reporting System (FAERS) database to guide the safety management of clinical medication.
Methods: The FAERS database was retrospectively queried to extract reports associated with SG from April 2020 to March 2023. To identify and evaluate potential AEs in patients receiving SG, several disproportionality analyses were employed: the reporting odds ratio (ROR), the proportional reporting ratio (PRR), the Bayesian confidence propagation neural network (BCPNN), and the multi-item gamma Poisson shrinker (MGPS).
Results: Overall, 2069 reports with SG as the “primary suspect” were identified. Notably, SG was significantly associated with an increased risk of blood and lymphatic system disorders (ROR, 7.18; 95% CI, 6.58–7.84) and hepatobiliary disorders (ROR, 2.68; 95% CI, 2.17–3.30) at the System Organ Class (SOC) level. Meanwhile, 61 significantly disproportionate preferred terms (PTs) that complied with all four algorithms simultaneously were adopted. Among them, anemia, thrombocytopenia, neutropenia, leukopenia, diarrhea, asthenia, alopecia, and electrolyte imbalance were consistent with the common AEs described in the clinical trials and prescribing information of SG. Furthermore, unexpected significant AEs such as colitis (ROR, 12.09; 95% CI, 9.1–16.08), heart rate increased (ROR, 5.11; 95% CI, 3.84–6.79), sepsis (ROR, 4.77; 95% CI, 3.59–6.34), cholestasis (ROR, 6.28; 95% CI, 3.48–11.36), blood bilirubin increased (ROR, 4.65; 95% CI, 2.42–8.94), and meningitis (ROR, 7.23; 95% CI, 2.71–19.29) were also detected. The median time to onset of SG-related AEs was 14 days [interquartile range (IQR), 7–52], with the majority occurring within the first month of SG treatment.
Conclusion: Our study validates the commonly known AEs and also identifies some potentially emerging safety issues related to SG in real-world clinical practice, which could provide valuable vigilance evidence for clinicians and pharmacists in managing the safety of SG.
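The ROR reported throughout the study is computed from a standard 2x2 contingency table, with a 95% CI from the usual log-normal approximation. The formula below is the standard one; the counts are made-up illustrative numbers, not FAERS data.

```python
import math

# Reporting odds ratio (ROR) with 95% CI from a 2x2 table:
#   a: reports with the drug and the AE of interest   b: the drug, all other AEs
#   c: other drugs, the AE of interest                d: other drugs, other AEs

def ror_with_ci(a, b, c, d):
    ror = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of ln(ROR)
    lo = ror * math.exp(-1.96 * se)
    hi = ror * math.exp(1.96 * se)
    return ror, lo, hi

ror, lo, hi = ror_with_ci(20, 80, 100, 9800)
```

A disproportionality signal is typically flagged when the lower bound of the 95% CI exceeds 1, as with all the associations quoted in the results above.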
Beyond the Obvious: Evaluating the Reasoning Ability In Real-life Scenarios of Language Models on Life Scapes Reasoning Benchmark (LSR-Benchmark)
This paper introduces the Life Scapes Reasoning Benchmark (LSR-Benchmark), a
novel dataset targeting real-life scenario reasoning, aiming to close the gap
in artificial neural networks' ability to reason in everyday contexts. In
contrast to domain knowledge reasoning datasets, LSR-Benchmark comprises
free-text formatted questions with rich information on real-life scenarios,
human behaviors, and character roles. The dataset consists of 2,162 questions
collected from open-source online sources and is manually annotated to improve
its quality. Experiments are conducted using state-of-the-art language models,
such as gpt3.5-turbo and instruction fine-tuned llama models, to test the
performance in LSR-Benchmark. The results reveal that humans outperform these
models significantly, indicating a persisting challenge for machine learning
models in comprehending daily human life.
Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation
New Natural Language Processing (NLP) benchmarks are urgently needed to align
with the rapid development of large language models (LLMs). We present Xiezhi,
the most comprehensive evaluation suite designed to assess holistic domain
knowledge. Xiezhi comprises 220,000 multiple-choice questions spanning 516
diverse disciplines across 13 different subjects, accompanied by
Xiezhi-Specialty and Xiezhi-Interdiscipline, both with 15k questions. We
evaluate 47 cutting-edge LLMs on Xiezhi. Results
indicate that LLMs exceed the average performance of humans in science,
engineering, agronomy, medicine, and art, but fall short in economics,
jurisprudence, pedagogy, literature, history, and management. We anticipate
Xiezhi will help analyze important strengths and shortcomings of LLMs, and the
benchmark is released at https://github.com/MikeGu721/XiezhiBenchmark.
Comment: Under review at NeurIPS 2023